Training Error, Generalization Error and Learning Curves in Neural Learning

Author

  • Shun-ichi Amari
Abstract

A neural network is trained by using a set of available examples to minimize the training error such that the network parameters fit the examples well. However, it is desired to minimize the generalization error, to which no direct access is possible. There are discrepancies between the training error and the generalization error due to the statistical fluctuation of examples. The present talk focuses on this problem from the statistical point of view. When the number of training examples is large, we have a universal asymptotic evaluation of the discrepancy between the two errors. This can be used for model selection based on the information criterion. When the number of training examples is small, their discrepancy is big, causing a serious overfitting or overtraining problem. We analyze this phenomenon by using a simple model. It is surprising that the generalization error even increases as the number of examples increases in a certain range. This shows the inadequacy of the minimum-training-error learning method. We evaluate various means of overcoming the overtraining, such as cross-validated early stopping of training, introduction of regularization terms, model selection, and others.

Let N(w) be a multilayer feedforward neural network which is specified by a modifiable parameter vector w. The network is trained by the examples in the training set, D_t = \{(x_1, z_1), \ldots, (x_t, z_t)\}, where x_i is the input and z_i is the corresponding output of the i-th example. Learning is a sequential procedure to search for the w which fits the examples well. This can be attained by minimizing the training error

e_{\mathrm{train}}(w) = \frac{1}{t} \sum_{i=1}^{t} l(x_i, z_i; w),

where l(x_i, z_i; w) is the error or loss function, whose typical case is the least-squares error

l(x, z; w) = \frac{1}{2} \left| z - f(x, w) \right|^2

in terms of the output f(x, w) of the network N(w). However, learning is necessary for processing future examples, so it is desired to minimize the generalization error

e_{\mathrm{gen}}(w) = E[l(x, z; w)],

where the expectation is taken over the possible future examples. Minimizing the generalization error is different from minimizing the training error, even when the training examples and future examples are subject to the same probability distribution. To analyze their discrepancy, we use the stochastic setting in which the input x is generated randomly from a fixed (but unknown) probability distribution q(x) and the behavior of the network N(w) is stochastic, the output z being randomly generated from the conditional distribution p(z | x; w). When z is a noisy version of the output f(x, w), we have ...
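As a rough, hands-on illustration of this discrepancy, here is a minimal Python sketch (not from the paper; the sine target, the polynomial model family, and all names are illustrative assumptions). It minimizes the training error exactly by least squares and approximates e_gen by Monte Carlo on a large fresh sample drawn from the same distribution:

```python
import numpy as np

# Sketch of the training/generalization gap: inputs x ~ q(x),
# outputs z = f*(x) + noise, model f(x, w) = a polynomial in x.
rng = np.random.default_rng(0)

def sample(n):
    x = rng.uniform(-1.0, 1.0, n)                          # x drawn from a fixed q(x)
    z = np.sin(np.pi * x) + 0.3 * rng.standard_normal(n)   # noisy target
    return x, z

def error(w, x, z):
    # Empirical mean of l(x, z; w) = 1/2 |z - f(x, w)|^2
    return np.mean(0.5 * (z - np.polyval(w, x)) ** 2)

t = 20                                 # small training set
x_tr, z_tr = sample(t)
x_te, z_te = sample(100_000)           # large fresh sample approximates e_gen(w)

for degree in (1, 3, 9):
    w = np.polyfit(x_tr, z_tr, degree)  # least squares = minimizing e_train(w)
    print(f"degree {degree}: e_train = {error(w, x_tr, z_tr):.3f}, "
          f"e_gen ~ {error(w, x_te, z_te):.3f}")
```

With so few examples, raising the model's degree drives e_train toward zero while the e_gen estimate grows; this is the overfitting regime the talk analyzes, and it is the point where early stopping, a regularization term, or model selection would intervene.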


Similar resources

Modeling of measurement error in refractive index determination of fuel cell using neural network and genetic algorithm

Abstract: In this paper, a method for determination of the refractive index in the membrane of a fuel cell, based on a three-longitudinal-mode laser heterodyne interferometer, is presented. The optical path difference between the target and reference paths is fixed, and the phase shift is then calculated in terms of the refractive index shift. The measurement accuracy of this system is limited by nonlinearity erro...

Full text

Two Novel Learning Algorithms for CMAC Neural Network Based on Changeable Learning Rate

The Cerebellar Model Articulation Controller (CMAC) neural network is a computational model of the cerebellum which acts as a lookup table. The advantages of CMAC are fast learning convergence and the capability of mapping nonlinear functions, due to its local generalization of weight updating, single structure, and easy processing. In the training phase, the disadvantage of some CMAC models is an unstable phenomenon...

Full text
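For readers unfamiliar with CMAC, the following is a minimal sketch of the lookup-table mechanism described in the abstract above, assuming a 1-D input, overlapping tilings, and a fixed learning rate; the paper's changeable-learning-rate algorithms are not reproduced here, and all class and parameter names are illustrative:

```python
import numpy as np

class CMAC:
    """Minimal 1-D CMAC: overlapping tilings act as a lookup table."""
    def __init__(self, n_tilings=8, n_cells=64, lr=0.5, lo=0.0, hi=1.0):
        self.n_tilings, self.n_cells, self.lr = n_tilings, n_cells, lr
        self.lo, self.hi = lo, hi
        self.w = np.zeros((n_tilings, n_cells))

    def _active(self, x):
        # Each tiling is offset by a fraction of one cell width, so nearby
        # inputs share cells: this is the local generalization of CMAC.
        s = (x - self.lo) / (self.hi - self.lo) * (self.n_cells - 1)
        return [(t, int(s + t / self.n_tilings) % self.n_cells)
                for t in range(self.n_tilings)]

    def predict(self, x):
        return sum(self.w[t, c] for t, c in self._active(x))

    def update(self, x, target):
        # Local weight update: only the active cells change.
        err = target - self.predict(x)
        for t, c in self._active(x):
            self.w[t, c] += self.lr * err / self.n_tilings

cmac = CMAC()
for _ in range(2000):
    x = np.random.rand()
    cmac.update(x, np.sin(2 * np.pi * x))
print(cmac.predict(0.25))  # roughly sin(pi/2) = 1 after training
```

The fixed lr here is exactly the knob the paper replaces with a changeable learning rate to address the unstable behavior mentioned in the abstract.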

Learning and generalisation in radial basis function networks

The two-layer radial basis function network, with fixed centers of the basis functions, is analyzed within a stochastic training paradigm. Various definitions of generalization error are considered, and two such definitions are employed in deriving generic learning curves and generalization properties, both with and without a weight decay term. The generalization error is shown analytically to ...

Full text
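A minimal sketch of the setting this abstract analyzes: a two-layer RBF network with fixed basis-function centers, where training only the output weights with a weight-decay term reduces to regularized (ridge) linear regression on the RBF features. The Gaussian widths, synthetic data, and decay strength below are arbitrary illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(1)

# Fixed centers: only the output layer is trained.
centers = np.linspace(-1, 1, 10)
width = 0.3

def features(x):
    # Gaussian RBF activations for each (input, center) pair.
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

x_tr = rng.uniform(-1, 1, 30)
z_tr = np.sin(np.pi * x_tr) + 0.2 * rng.standard_normal(30)

Phi = features(x_tr)
lam = 1e-2  # weight-decay strength lam * |w|^2
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(len(centers)), Phi.T @ z_tr)

x_te = rng.uniform(-1, 1, 50_000)  # fresh sample estimates the generalization error
z_te = np.sin(np.pi * x_te) + 0.2 * rng.standard_normal(50_000)
print("e_gen ~", np.mean(0.5 * (z_te - features(x_te) @ w) ** 2))
```

Rerunning with lam = 0 shows the effect the abstract studies analytically: without decay the output weights fit the noise more aggressively.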

Universal learning curves of support vector machines.

Using methods of statistical physics, we investigate the role of model complexity in learning with support vector machines (SVMs), which are an important alternative to neural networks. We show the advantages of using SVMs with kernels of infinite complexity on noisy target rules, which, in contrast to common theoretical beliefs, are found to achieve optimal generalization error although the tr...

Full text
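As a hands-on counterpart to this result, the sketch below (assuming scikit-learn is available) trains an SVM with an RBF kernel, a kernel of effectively infinite complexity, on a noisy target rule and compares training error against an estimate of the generalization error; the data model and parameters are illustrative assumptions, not the paper's statistical-physics setup:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)

def noisy_data(n, flip=0.1):
    X = rng.standard_normal((n, 2))
    y = np.sign(X[:, 0])          # simple target rule
    noise = rng.random(n) < flip  # flip 10% of the labels
    y[noise] *= -1
    return X, y

X_tr, y_tr = noisy_data(200)
X_te, y_te = noisy_data(20_000)   # large fresh sample estimates e_gen

clf = SVC(kernel="rbf", C=1.0).fit(X_tr, y_tr)
print("training error:      ", 1 - clf.score(X_tr, y_tr))
print("generalization error:", 1 - clf.score(X_te, y_te))
```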

Cystoscopy Image Classification Using Deep Convolutional Neural Networks

In the past three decades, the use of smart methods in medical diagnostic systems has attracted the attention of many researchers. However, no smart activity has been provided in the field of medical image processing for diagnosis of bladder cancer through cystoscopy images, despite its high prevalence in the world. In this paper, two well-known convolutional neural networks (CNNs) ...

Full text

Average-Case Learning Curves for Radial Basis Function Networks

The application of statistical physics to the study of the learning curves of feedforward connectionist networks has, to date, been concerned mostly with networks that do not include hidden layers. Recent work has extended the theory to networks such as committee machines and parity machines; however, these are not networks that are often used in practice, and an important direction for current a...

Full text


Publication date: 1995